A Power Management Proxy with a New Best-of-N Bloom Filter Design to Reduce False Positives Funding acknowledgment goes here
نویسندگان
چکیده
Bloom filters are a probabilistic data structure used to evaluate set membership. A group of hash functions are used to map elements into a Bloom filter and to test elements for membership. In this paper, we propose using multiple groups of hash functions and selecting the group that generates the Bloom filter instance with the smallest number of bits set to 1. We evaluate the performance of this new Best-of-N method using order statistics and an actual implementation. Our analysis shows that significant reduction in the probability of a false positive can be achieved. We also propose and evaluate a new method that uses a Random Number Generator (RNG) to generate multiple hashes from one initial “seed” hash. This RNG method (motivated by a method from Kirsch and Mitzenmacher) makes the computational expense of the Best-of-N method very modest. The target application is a power management proxy for P2P applications executing in a resource-constrained “SmartNIC”.
منابع مشابه
A Cuckoo Filter Modification Inspired by Bloom Filter
Probabilistic data structures are so popular in membership queries, network applications, and so on. Bloom Filter and Cuckoo Filter are two popular space efficient models that incorporate in set membership checking part of many important protocols. They are compact representation of data that use hash functions to randomize a set of items. Being able to store more elements while keeping a reaso...
متن کاملReducing False Positives of a Bloom Filter using Cross-Checking Bloom Filters
A Bloom filter is a compact data structure that supports membership queries on a set, allowing false positives. The simplicity and the excellent performance of a Bloom filter make it a standard data structure of great use in many network applications. In reducing the false positive rate of a Bloom filter, it is well known that the size of a Bloom filter and accordingly the number of hash indice...
متن کاملThe Gaussian Bloom Filter
Modern databases tailored to highly distributed, fault tolerant management of information for big data applications exploit a classical data structure for reducing disk and network I/O as well as for managing data distribution: The Bloom filter. This data structure allows to encode small sets of elements, typically the keys in a key-value store, into a small, constant-size data structure. In or...
متن کاملAn approximate dynamic programming approach for improving accuracy of lossy data compression by Bloom filters
Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes–no Bloom filter, which as a data structure consisting of two parts: the yes-filter which is a standard Bloom filter and the no-filter which is another Bloom filter whose purpose is to...
متن کاملAutoscaling Bloom Filter: Controlling Trade-off Between True and False Positives
A Bloom filter is a simple data structure supporting membership queries on a set. The standard Bloom filter does not support the delete operation, therefore, many applications use a counting Bloom filter allowing the deletion. This paper proposes a generalization of the counting Bloom filters approach, called “autoscaling Bloom filters”, which allows elastic adjustment of its capacity with prob...
متن کامل